Finding Longest Increasing and Common Subsequences in Streaming Data

Authors

  • David Liben-Nowell
  • Erik Vee
  • An Zhu
Abstract

In this paper, we present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data streaming model. For the problem of deciding whether the LIS of a given stream of integers drawn from {1, ..., m} has length at least k, we discuss a one-pass streaming algorithm using O(k log m) space, with update time either O(log k) or O(log log m). For the problem of returning the actual longest increasing subsequence itself, we give a ⌈log(1 + 1/ε)⌉-pass streaming algorithm with update time O(log k) or O(log log m) that uses space O(k^{1+ε} log m), for any ε > 0. We also prove a lower bound of Ω(k) on the space required for any streaming algorithm for LIS, even when the input stream is a permutation of {1, ..., m}. We discuss a simple LIS-based algorithm for LCS, and we also give several lower bounds on this problem, of which the strongest is the following: when the elements of two n-element streams are presented in an adversarial order, we need space Ω(n/ρ) to approximate the length of their LCS to within a factor of ρ, even when the two streams are permutations of each other.
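To make the one-pass decision problem concrete, here is a minimal sketch in Python. It assumes the classic patience-sorting technique (keeping, for each length, the smallest value that can end a strictly increasing subsequence of that length); the abstract does not spell out the paper's algorithm, so this is an illustration rather than the authors' exact method. The function name `lis_at_least` is hypothetical.

```python
from bisect import bisect_left
from typing import Iterable


def lis_at_least(stream: Iterable[int], k: int) -> bool:
    """Decide in one pass whether the stream has a strictly
    increasing subsequence of length at least k.

    Assumption: this is the standard patience-sorting approach,
    not necessarily the algorithm from the paper.
    """
    if k <= 0:
        return True
    # tails[i] = smallest value ending an increasing subsequence
    # of length i + 1 seen so far. The list stays sorted and never
    # grows past k entries, so it holds at most k stream values.
    tails: list[int] = []
    for x in stream:
        pos = bisect_left(tails, x)  # binary search: O(log k) per item
        if pos == len(tails):
            tails.append(x)          # x extends the longest subsequence
            if len(tails) >= k:
                return True          # stop early once length k is reached
        else:
            tails[pos] = x           # x is a smaller tail for length pos + 1
    return False


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    print(lis_at_least(data, 4))
    print(lis_at_least(data, 5))
```

Each stored value needs about log m bits when items come from {1, ..., m}, which matches the O(k log m) space figure, and the binary search gives the O(log k) update time. The O(log log m) update time mentioned in the abstract would need a different search structure, which this sketch does not attempt.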


Similar resources

DISCOVERY of LONGEST INCREASING SUBSEQUENCES and its VARIANTS using DNA OPERATIONS

The Longest Increasing Subsequence (LIS) and Common Longest Increasing Subsequence (CLIS) are important in many data mining applications. We propose algorithms to discover LIS and CLIS from varied databases. This work finds all increasing subsequences from the given database, finds increasing subsequences in n sliding windows, longest increasing sequences in one or more sequences, decrea...


Faster Algorithms for Computing Longest Common Increasing Subsequences

We present algorithms for finding a longest common increasing subsequence of two or more input sequences. For two sequences of lengths n and m, where m ≥ n, we present an algorithm with an output-dependent expected running time of O((m + nl) log log σ + Sort) and O(m) space, where l is the length of an LCIS, σ is the size of the alphabet, and Sort is the time to sort each input sequence. For k ...


A linear space algorithm for computing a longest common increasing subsequence

Let X and Y be sequences of integers. A common increasing subsequence of X and Y is an increasing subsequence common to X and Y. In this note, we propose an O(|X| · |Y|)-time and O(|X| + |Y|)-space algorithm for finding one of the longest common increasing subsequences of X and Y, which improves the space complexity of Yang et al. [I.H. Yang, C.P. Huang, K.M. Chao, A fast algorithm for comp...
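As an illustration of the problem stated in this abstract, the following Python sketch computes only the length of a longest common increasing subsequence in O(|X| · |Y|) time and O(|Y|) extra space, using the well-known dynamic program. It is not the linear-space recovery algorithm proposed in the note, whose details are not given here, and the function name `lcis_length` is hypothetical.

```python
from typing import Sequence


def lcis_length(xs: Sequence[int], ys: Sequence[int]) -> int:
    """Length of a longest common strictly increasing subsequence.

    Assumption: this is the textbook O(|X|*|Y|) dynamic program and
    returns only the length, not the subsequence itself.
    """
    # dp[j] = length of the longest common increasing subsequence
    # of the processed prefix of xs and ys that ends exactly at ys[j].
    dp = [0] * len(ys)
    for x in xs:
        best = 0  # best dp[j] over earlier positions j with ys[j] < x
        for j, y in enumerate(ys):
            if y < x:
                # dp[j] is untouched in this row because y != x.
                best = max(best, dp[j])
            elif y == x:
                dp[j] = max(dp[j], best + 1)
    return max(dp, default=0)


if __name__ == "__main__":
    print(lcis_length([3, 1, 2, 4], [1, 3, 2, 4]))
```

Recovering an actual subsequence with this table normally requires storing predecessor information for every row, which is the space cost that a linear-space method has to avoid.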


A New Family of String Classifiers Based on Local Relatedness

This paper introduces a new family of string classifiers based on local relatedness. We use three types of local relatedness measurements, namely, longest common substrings (LCStr's), longest common subsequences (LCSeq's), and window-accumulated longest common subsequences (wLCSeq's). We show that finding the optimal classifier for two given sets of strings (the positive set and the negative set)...


A Fast Heuristic Search Algorithm for Finding the Longest Common Subsequence of Multiple Strings

Finding the longest common subsequence (LCS) of multiple strings is an NP-hard problem, with many applications in the areas of bioinformatics and computational genomics. Although significant efforts have been made to address the problem and its special cases, the increasing complexity and size of biological data require more efficient methods applicable to an arbitrary number of strings. In thi...




Publication date: 2005